25 research outputs found

MODBASE, a database of annotated comparative protein structure models and associated resources.

Author: Barkan David T
Carter Hannah
Davis Fred P
Eramian David
Eswar Narayanan
Karchin Rachel
Kelly Libusha
Mankoo Parminder
Marti-Renom Marc A
Pieper Ursula
Sali Andrej
Webb Ben M
Publication venue: eScholarship, University of California
Publication date: 23/10/2008

MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5,152,695 reliable models for domains in 1,593,209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/).

PubMed Central

eScholarship - University of California

MODBASE: a database of annotated comparative protein structure models and associated resources

Author: Braberg Hannes
Davis Fred P.
Eramian David
Eswar Narayanan
Karchin Rachel
Kelly Libusha
Madhusudhan M. S.
Marti-Renom Marc
Melo Francisco
Pieper Ursula
Rossi Andrea
Sali Andrej
Shen Min-Yi
Webb Ben M.
Publication venue: Oxford University Press
Publication date: 28/12/2005

MODBASE is a database of annotated comparative protein structure models for all available protein sequences that can be matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on MODELLER for fold assignment, sequence–structure alignment, model building and model assessment. MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, and improvements in the software for calculating the models. MODBASE currently contains 3 094 524 reliable models for domains in 1 094 750 out of 1 817 889 unique protein sequences in the UniProt database (July 5, 2005); only models based on statistically significant alignments and models assessed to have the correct fold despite insignificant alignments are included. MODBASE also allows users to generate comparative models for proteins of interest with the automated modeling server MODWEB. Our other resources integrated with MODBASE include comprehensive databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites and structurally defined binary domain interfaces (PIBASE) as well as predictions of ligand binding sites, interactions between yeast proteins, and functional consequences of human nsSNPs (LS-SNP).

Crossref

PubMed Central

GeMMA: functional subfamily classification within superfamilies of predicted protein structural domains

Author: Christine Orengo
David A. Lee
Robert Rentzsch
Publication venue: Oxford University Press
Publication date: 01/01/2009

GeMMA (Genome Modelling and Model Annotation) is a new approach to automatic functional subfamily classification within families and superfamilies of protein sequences. A major advantage of GeMMA is its ability to subclassify very large and diverse superfamilies with tens of thousands of members, without the need for an initial multiple sequence alignment. Its performance is shown to be comparable to the established high-performance method SCI-PHY. GeMMA follows an agglomerative clustering protocol that uses existing software for sensitive and accurate multiple sequence alignment and profile–profile comparison. The produced subfamilies are shown to be equivalent in quality whether whole protein sequences are used or just the sequences of component predicted structural domains. A faster, heuristic version of GeMMA that also uses distributed computing is shown to maintain the performance levels of the original implementation. The use of GeMMA to increase the functional annotation coverage of functionally diverse Pfam families is demonstrated. It is further shown how GeMMA clusters can help to predict the impact of experimentally determining a protein domain structure on comparative protein modelling coverage, in the context of structural genomics.

CiteSeerX

Crossref

PubMed Central

Regulatory Elements within the Prodomain of Falcipain-2, a Cysteine Protease of the Malaria Parasite Plasmodium falciparum

Author: Andrej Sali
David T. Barkan
Kailash C. Pandey
Philip J. Rosenthal
Publication venue: Public Library of Science
Publication date: 01/01/2009

Falcipain-2, a papain family cysteine protease of the malaria parasite Plasmodium falciparum, plays a key role in parasite hydrolysis of hemoglobin and is a potential chemotherapeutic target. As with many proteases, falcipain-2 is synthesized as a zymogen, and the prodomain inhibits activity of the mature enzyme. To investigate the mechanism of regulation of falcipain-2 by its prodomain, we expressed constructs encoding different portions of the prodomain and tested their ability to inhibit recombinant mature falcipain-2. We identified a C-terminal segment (Leu155–Asp243) of the prodomain, including two motifs (ERFNIN and GNFD) that are conserved in cathepsin L sub-family papain family proteases, as the mediator of prodomain inhibitory activity. Circular dichroism analysis showed that the prodomain including the C-terminal segment, but not constructs lacking this segment, was rich in secondary structure, suggesting that the segment plays a crucial role in protein folding. The falcipain-2 prodomain also efficiently inhibited other papain family proteases, including cathepsin K, cathepsin L, cathepsin B, and cruzain, but it did not inhibit cathepsin C or tested proteases of other classes. A structural model of pro-falcipain-2 was constructed by homology modeling based on crystallographic structures of mature falcipain-2, procathepsin K, procathepsin L, and procaricain, offering insights into the nature of the interaction between the prodomain and mature domain of falcipain-2 as well as into the broad specificity of inhibitory activity of the falcipain-2 prodomain.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction

Author: Ching-Wai Tan
David T Jones
Publication venue: BioMed Central
Publication date: 01/02/2008

Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks. Results: The best-performing neural network is the one whose input comprises PSI-BLAST [1] profiles of residue pairs, pairwise distances and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested in discriminating the native structure from a set of decoys for all decoy datasets tested. Conclusion: This method is demonstrated to be viable, and furthermore evolutionary information is successfully used in the neural networks to improve decoy discrimination.

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Trends in template/fragment-free protein structure prediction

Author: Eshel Faraggi
Hongxing Lei
Yaoqi Zhou
Yong Duan
Yuedong Yang
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

ASSESSMENT AND PREDICTION OF PROTEIN STRUCTURES

Author: Eramian David Edward
Publication venue: eScholarship, University of California
Publication date: 01/01/2008

An ambitious goal of modern biology is to understand the structure(s), interaction(s) and function(s) of each protein within cells and organisms. Understanding the nature of the interactions a protein makes is important because no protein exists in isolation, but rather functions through interactions with other macromolecules. Knowledge about the function of proteins is essential to understanding biological processes. Structure is the unifying component: both interactions and functions are intrinsically related to structure, as the structure of a protein helps define its function and affects the nature, type, and number of interactions it has with other macromolecules. Great attention has been paid to the development of methods for both the theoretical prediction and experimental determination of protein structure. Though experimentally-derived structures are more accurate, they are relatively scarce: of the millions of known protein sequences, well fewer than 1% of their corresponding structures have been solved experimentally. In the absence of an experimentally determined structure, computational models are often valuable for generating testable hypotheses and giving insight into existing experimental data. Such computational structure models are available for over two orders of magnitude more protein sequences than are experimentally determined structures, yet suffer from two limitations that experimentally determined structures do not: they frequently contain significant errors, and their accuracy cannot be readily assessed. The research described herein sought to increase the accuracy and applicability of computational protein models by addressing these two limitations. 
This broad goal was approached in four principal ways: (1) identifying the most native-like models from among sets of similar models; (2) predicting the absolute accuracy of protein structure models; (3) improving the accuracy of target/template alignments to increase the accuracy of comparative models built from distantly related template structures; and (4) developing a unified protein structure prediction protocol that makes the best use of all available information about the structure of a given protein, regardless of whether it is directly based on experiment, on the broader knowledge base, on statistical potentials, or on intuition.

eScholarship - University of California

Feature-rich distance-based terrain synthesis

Author: Eramian M. (Mark)
Mould D. (David)
Rusnell B. (Brennan)
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2009

This paper describes a novel terrain synthesis method based on distances in a weighted graph. A height field is determined by least-cost paths in a weighted graph from a set of generator nodes. The shapes of individual terrain features, such as mountains, hills, and craters, are specified by a monotonically decreasing profile describing the cross-sectional shape of a feature. The locations of features in the terrain are specified by placing the generators; secondary ridges are placed by pathing. We show the method to be robust and easy to control, even making it possible to embed images in terrain shadows. The method can produce a wide range of realistic synthetic terrains such as mountain ranges, craters, cinder cones, and hills. The ability to manually place terrain features that incorporate multiple profiles produces heterogeneous terrains that compare favorably to existing methods.
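
As a rough illustration of the idea in this abstract, the sketch below computes least-cost path distances from generator cells with a multi-source Dijkstra search and maps each distance through a monotonically decreasing profile. The 4-connected grid, the random cell costs, the `cone_profile` shape, and all function and parameter names are assumptions made for illustration; they are not taken from the paper.

```python
import heapq
import random


def synthesize_heightfield(width, height, generators, profile, seed=0):
    """Return a height field (list of rows) for a width x height grid.

    Assumed setup, not the paper's: cells form a 4-connected grid graph,
    each cell has a random traversal cost in [1, 2), and an edge costs the
    average of its two endpoint cells.
    """
    rng = random.Random(seed)
    cost = [[1.0 + rng.random() for _ in range(width)] for _ in range(height)]
    dist = [[float("inf")] * width for _ in range(height)]

    # Multi-source Dijkstra: every generator starts at distance zero.
    heap = []
    for x, y in generators:
        dist[y][x] = 0.0
        heapq.heappush(heap, (0.0, x, y))

    while heap:
        d, x, y = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                nd = d + 0.5 * (cost[y][x] + cost[ny][nx])
                if nd < dist[ny][nx]:
                    dist[ny][nx] = nd
                    heapq.heappush(heap, (nd, nx, ny))

    # Height is the profile evaluated at the least-cost distance.
    return [[profile(dist[y][x]) for x in range(width)] for y in range(height)]


def cone_profile(d, peak=100.0, slope=4.0):
    """Hypothetical monotonically decreasing profile: a clipped linear cone."""
    return max(0.0, peak - slope * d)


# Two generators, given as (x, y) grid coordinates, produce two peaks.
terrain = synthesize_heightfield(32, 32, [(8, 8), (24, 20)], cone_profile)
```

Because the edge weights are random, the level sets of the distance field are irregular rather than circular, which is what gives features built this way a less synthetic outline. Swapping `cone_profile` for another decreasing function changes the cross-section of every feature without touching the search.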

Carleton University's Institutional Repository